For bacterial communities containing hundreds to thousands of distinct populations, 
connecting functional processes and environmental dynamics at high taxonomic 
resolution has remained challenging. Here we use the expression of ribosomal proteins 
(%RP) as a proxy for in situ activity of 200 taxa within 20 metatranscriptomic samples in 
a coastal ocean time series encompassing both seasonal variability and diel dynamics. 
%RP patterns grouped the taxa into seven activity clusters with distinct profiles in 
functional gene expression and correlations with environmental gradients. Clusters 1-3 
had their highest potential activity in the winter and fall, and included some of the 
most active taxa, while Clusters 4-7 had their highest potential activity in the spring and 
summer. Cluster 1 taxa were characterized by gene expression for motility and complex 
carbohydrate degradation (dominated by Gammaproteobacteria and Bacteroidetes), and 
Cluster 2 taxa by transcription of genes for amino acid and aromatic compound 
metabolism and aerobic anoxygenic phototrophy (Roseobacter). Other activity clusters 
were enriched in transcripts for proteorhodopsin and methylotrophy (Cluster 4; SAR11 
and methylotrophs), photosynthesis and attachment (Clusters 5 and 7; Synechococcus, 
picoeukaryotes, Verrucomicrobia, and Planctomycetes), and sulfur oxidation (Cluster 7; 
Gammaproteobacteria). The seasonal patterns in activity were overlain, and sometimes 
obscured, by large differences in %RP over shorter day-night timescales. Seventy-eight 
taxa, many of them heterotrophs, had a higher %RP activity index during the day than 
night, indicating a strong diel activity rhythm at this coastal site. Emerging from these 
taxonomically- and time-resolved estimates of in situ microbial activity are predictions of 
specific ecological groupings of microbial taxa in a dynamic coastal environment. 
INTRODUCTION 

The influence of a bacterial population on ecosystem processes 
is a function of abundance, metabolic capabilities, and activity 
rates. Linking these three characteristics at a fine taxonomic reso- 
lution in dynamic environments represents a significant challenge 
for developing a predictive framework for microbial ecology. 
Much progress has been made in quantifying microbial popu- 
lation abundances and potential function via rRNA genes and 
metagenomic surveys, but taxonomically-resolved in situ mea- 
sures of activity levels have been more difficult to obtain. Instead, 
bulk measurements of community production (such as leucine 
incorporation) or single-gene transcription measures (limited 
by sequence heterogeneity, incubation steps, or low taxonomic 
resolution) have been the typical methodologies. Recently, community-wide 
analyses of 16S rRNA:rDNA ratios have provided 
detailed views of microbial taxa that indicate a decoupling of 
abundance and activity (Campbell et al., 2011; Hugoni et al., 
2013; Hunt et al., 2013). However, this approach is unable to 
link taxon activity with expressed functional capabilities and cannot 
account for variations in rDNA copy number or extended 
ribosome lifetimes (Blazewicz et al., 2013). 

Increased sequencing capabilities now allow for genome-wide 
transcriptional profiles of abundant taxa (defined by similarity 
binning to the closest sequenced genome) within metatranscriptomic 
data sets (Gifford et al., 2013; Ottesen et al., 2013). We 
recently explored the possibility of leveraging ribosomal protein 
transcription within these reference genome bins as a proxy for 
in situ activity (Gifford et al., 2013). Ribosomal proteins are an 
essential component of a cell's translation machinery, and their 
evolutionary conservation makes them valuable for taxonomic 
identification as well. Although some taxa deviate (Blazewicz 
et al., 2013), cells generally couple translation to activity, and 
increases in RP expression have been found to correlate well with 
increased activity in all three domains of life (Eisen et al., 1998; 
Wei et al., 2001; Hendrickson et al., 2008). Previous work has 
shown that the percent of a taxon's transcriptome allotted to 
RPs provides a relative estimate of activity that agrees well with 
experimentally-determined growth rates (Gifford et al., 2013). 

Here we investigate the potential activity levels and gene 
expression patterns of bacterioplankton in a dynamic coastal 
environment over a year-long metatranscriptomic study. The 
recruitment of transcripts to 200 reference genomes in this time 
series, encompassing both short term (day-night) and long term 
(seasonal) variability, uncovered a highly dynamic community 
with distinct groups of taxa whose activity varied with environ- 
mental parameters. The functional genes expressed within these 
groups revealed metabolic capabilities mapping to the activity 
patterns. 

MATERIALS AND METHODS 
SAMPLE COLLECTION 

Sampling occurred at Marsh Landing, Sapelo Island, Georgia, 
U.S.A. (31°25'4.08" N, 81°17'43.26" W) as part of the Sapelo 
Island Microbial Observatory program (http://simo.marsci.uga. 
edu). Samples and environmental measurements were collected 
quarterly (2008: August 6-7, November 5-7; 2009: February 
15-17, May 13-15, August 12-14) with each sampling expedition 
occurring at four consecutive high tides, resulting in two consecu- 
tive pairs of day-night samples per season. Cell collection for RNA 
extraction was conducted as described previously (Poretsky et al., 
2009; Gifford et al., 2011). Briefly, 6-8 L of water was pumped 
directly from a depth of 1 m and passed through a 3-µm pore-size 
prefilter (Capsule Pleated Versapor Membrane; Pall Life Sciences, 
Ann Arbor, MI, USA) and a 0.22-µm pore-size collection filter 
(Supor polyethersulfone; Pall Life Sciences). The 0.22-µm filter 
was placed in a WhirlPak bag and flash frozen in liquid nitrogen. 
Total time from start of filtration to flash freezing was 11-14 min. 

Sample processing and sequencing 

RNA processing is described in Gifford et al. (2011), including 
the addition of an internal RNA standard to calculate transcript 
abundances on a per volume basis (Moran et al., 2013; Satinsky 
et al., 2013). Briefly, 25 ng (4.7 × 10^10 copies) of the RNA standard 
constructed from a pGEM-3Z plasmid and the frozen filter 
were added to the bead-lysis solution and RNA was extracted 
according to RNeasy kit (Qiagen) procedures. Residual DNA 
was removed using the Turbo DNA-free kit (Applied Biosystems, 
Austin, TX, USA) and rRNAs reduced first using Epicentre's 
mRNAOnly isolation kit (Madison, WI, USA) and then with the 
MICROBExpress and MICROBEnrich kits (Applied Biosystems). 
The enriched mRNA samples were then linearly amplified 
using the MessageAmp II-Bacteria kit (Applied Biosystems) and 
double stranded cDNA synthesized with Promega's Universal 
RiboClone cDNA synthesis system and random primers. 
Residual reactants and nucleotides from cDNA synthesis were 
removed using the QIAquick PCR purification kit. Two samples 
(FN56 and 57; August 2008) were sequenced by 454 pyrosequencing 
as described in Gifford et al. (2011), and four were sequenced 
with the Illumina GAIIx platform (described in Gifford et al., 
2013). The remaining 16 samples were sheared to ~300 bp 
with an E210 ultrasonicator (Covaris, Woburn, MA, USA), size 
selected in the range of 200-400 bp with a Beckman SPRI-TE 
robot (Beckman Coulter, Brea, CA, USA), and sequenced 
with Illumina GAIIx to obtain 150 × 150 bp paired end reads. 
Two samples (FN101B and FN146B) were technical replicates 
of samples FN101 and FN146, being derived from the same 
cDNA and Illumina library preparation, but were sequenced in 
independent lanes. Sequences are deposited in the CAMERA 
database (http://camera.calit2.net/about-camera/full-datasets) 
under accessions CAM_PROJ_Sapelo2008, CAM_P_0000917, 
and CAM_P_0001108. Reads were filtered with a quality score 
cutoff >20 and a minimum length ≥100 bp. Overlapping mate 
pairs were assembled using SHERA (Rodrigue et al., 2010) with a 
score >0.5. The SHERA assembled reads accounted for 50% of all 
reads and had a mean assembled length of 180 nt. Non-assembled 
reads were not considered in the downstream analysis. 
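
The internal-standard normalization mentioned above can be illustrated with a short Python sketch. It assumes the commonly used scaling in which each recovered standard read stands for (standard copies added / standard reads recovered) transcripts in the sample. The function name, the read counts, and the 7 L volume are hypothetical; only the 4.7 × 10^10 copies value is taken from the text.

```python
def transcripts_per_liter(gene_reads, standard_reads,
                          standard_copies_added, liters_filtered):
    """Estimate transcripts per liter from read counts (illustrative sketch).

    Assumption: reads scale linearly with transcript copies, so one
    recovered internal-standard read represents
    standard_copies_added / standard_reads transcripts.
    """
    if standard_reads <= 0 or liters_filtered <= 0:
        raise ValueError("standard_reads and liters_filtered must be positive")
    copies_per_read = standard_copies_added / standard_reads
    return gene_reads * copies_per_read / liters_filtered


# Hypothetical library: 1200 reads for a gene, 47,000 standard reads recovered,
# 4.7e10 standard copies added (value from the text), 7 L filtered.
example_abundance = transcripts_per_liter(
    gene_reads=1200,
    standard_reads=47_000,
    standard_copies_added=4.7e10,
    liters_filtered=7.0,
)
```

The same scaling factor can be applied to the summed RP reads of a taxon bin to obtain an absolute RP-per-liter value.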

BIOINFORMATIC PROCESSING 

Reads from all 22 libraries were compared to a custom database 
containing small and large subunit rRNAs (derived from the 
SILVA database, www.arb-silva.de, see Gifford et al., 2013) as well 
as the internal standard sequence using BLASTn. Reads with a bit 
score ≥50 to the custom SILVA database were considered rRNA 
and removed from further analysis. Hits to the internal standard 
sequence with a score of >50 were tallied and the reads 
removed from further analysis. The remaining potential protein 
encoding reads were annotated by a BLASTx homology search 
against RefSeq version 47, taking the top scoring hit with a bit 
score ≥40. Annotated reads were compiled into taxon bins based 
on the top scoring hit taxon ID from the RefSeq BLAST. The gene 
content associated with these taxon IDs is derived from isolate 
genome sequencing projects (metagenomic assemblies are not 
included) and can be in various states of completion (i.e., draft vs. 
complete). KEGG orthology (KO) and pathway information for 
the annotated reads was retrieved from the integrated microbial 
genomes database (IMG; img.jgi.doe.gov/cgi-bin/w/main.cgi). 
Proteorhodopsin genes within the top 200 transcript recruiting 
genomes were identified by a BLASTp homology search using 
three proteorhodopsin (PR) query sequences (NCBI accessions: 
254455918, 118594191, 225010551). 
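
The top-hit binning step described above can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in the study: it assumes the BLASTx results have already been parsed into (read, taxon ID, bit score) tuples, and the function name and example values are hypothetical. Only the bit score cutoff of 40 comes from the text.

```python
from collections import defaultdict


def bin_top_hits(hits, min_bit_score=40.0):
    """Assign each read to the taxon of its top-scoring hit.

    hits: iterable of (read_id, taxon_id, bit_score) tuples (assumed format).
    Reads whose best hit falls below min_bit_score are left unannotated.
    Returns a dict mapping taxon_id to the list of read_ids in that bin.
    """
    best = {}
    for read_id, taxon_id, bit_score in hits:
        # Keep only the highest-scoring hit seen so far for each read.
        if read_id not in best or bit_score > best[read_id][1]:
            best[read_id] = (taxon_id, bit_score)

    bins = defaultdict(list)
    for read_id, (taxon_id, bit_score) in best.items():
        if bit_score >= min_bit_score:
            bins[taxon_id].append(read_id)
    return dict(bins)


# Hypothetical hits: read2 fails the cutoff, read1 goes to its better hit.
example_hits = [
    ("read1", "taxonA", 55.0),
    ("read1", "taxonB", 48.0),
    ("read2", "taxonB", 39.5),
    ("read3", "taxonB", 61.2),
]
example_bins = bin_top_hits(example_hits)
```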

RP composition 

Ribosomal protein reads (RPs) were initially identified by a text 
based query for "ribosomal protein" in the RefSeq annotation. 
To confirm the annotation, the RP reads binning to the top 200 
transcript recruiting taxa were compared to the KEGG database 
in CAMERA via BLASTx. The 1.1 million RPs binned to 9296 
different genes, of which 96% fell into KO pathway "ribosome" 
(ko03010) and 2% hit RP modification KOs. The remaining 2% 
of reads without an RP KO annotation accounted for <5% of 
any given taxonomic bin's total RP hits and were kept in the 
downstream analysis. 

For the top 200 transcript-recruiting taxonomic bins, %RP 
was calculated as the sum of all RP annotated reads in a bin 
divided by the total number of reads in the bin. Significant 
differences in %RP abundance were determined by bootstrap- 
ping. For day-night differences, the observed mean difference in 
%RP was calculated for the nine day-night pairs [the two 454 
samples had no corresponding day sample and were excluded; 
samples with a technical replicate (FN101A/B and FN146A/B) 
were averaged]. The 18 samples were then randomly assigned 
to a pair and the random mean difference calculated for 10,000 
iterations. The p-value was the proportion of random mean differences 
greater than or equal to the observed mean difference, with 
p-values <0.05 considered significant. The same procedure was 
used to test for significant differences between day and night 
absolute RP per L obtained by internal standard normalization 
(Satinsky et al., 2013). Statistically significant seasonal differences 
in %RP were obtained by calculating an observed sum of squares 
difference between the five seasonal groupings (summer1, fall, 
winter, spring, summer2), and then randomly assigning samples 
to the seasonal groupings and calculating the p-value as the proportion 
of random sums of squares ≥ the observed sum of squares. 
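
The %RP index and the day-night randomization test described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the test is treated as one-sided (day minus night), random pairs are formed by shuffling the pooled sample values, and all function names and example values are hypothetical.

```python
import random


def percent_rp(rp_reads, total_reads):
    """%RP: ribosomal protein reads as a percentage of all reads in a bin."""
    return 100.0 * rp_reads / total_reads


def mean_pair_difference(pairs):
    """Mean of (day - night) over a list of (day, night) %RP pairs."""
    return sum(day - night for day, night in pairs) / len(pairs)


def day_night_p_value(pairs, iterations=10_000, seed=0):
    """Proportion of random pairings whose mean difference is >= the observed one."""
    rng = random.Random(seed)
    observed = mean_pair_difference(pairs)
    pooled = [value for pair in pairs for value in pair]
    count = 0
    for _ in range(iterations):
        rng.shuffle(pooled)
        # Re-pair the shuffled samples: even positions act as "day", odd as "night".
        shuffled_pairs = list(zip(pooled[0::2], pooled[1::2]))
        if mean_pair_difference(shuffled_pairs) >= observed:
            count += 1
    return count / iterations


# Hypothetical taxon with nine day-night pairs and consistently higher day %RP.
example_pairs = [(10.0 + i, 1.0 + i) for i in range(9)]
example_p = day_night_p_value(example_pairs, iterations=2000, seed=1)
```

The seasonal test follows the same pattern, with the between-season sum of squares replacing the mean pair difference as the test statistic.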

Clustering 

The top 200 transcript-recruiting taxonomic bins were clustered 
by calculating the %RP pairwise Pearson's correlation coefficients 
among all the bins, converting the coefficients to a distance 
matrix ([1 − corr. coeff.]/2), and clustering with the hclust 
function in R using the default settings and complete linkage. 
To compare patterns in %RP and environmental parameters 
measured during the SIMO sampling, we conducted a canon- 
ical correspondence analysis (CCA) using the cca function in 
the Vegan R package (Oksanen et al., 2013) with %RP as the 
species abundance metric and the samples as sites. An over- 
all model permutation test using the ANOVA function in R 
rejected the null hypothesis that %RP abundance and measured 
environmental gradients were not related (p < 0.01 for 1000 
iterations). 
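
The correlation-based clustering described above can be illustrated with a small, dependency-free Python sketch. It is not the R hclust call used in the study: it is a naive complete-linkage agglomeration on the (1 − r)/2 distance that stops at a requested number of clusters rather than returning a full dendrogram. The function names and example profiles are hypothetical, and profiles are assumed to be non-constant so that the correlation is defined.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_distance(profiles):
    """Pairwise distance (1 - r) / 2 between named %RP profiles."""
    names = list(profiles)
    return {
        (a, b): (1.0 - pearson(profiles[a], profiles[b])) / 2.0
        for a in names for b in names if a != b
    }


def complete_linkage(profiles, n_clusters):
    """Merge the closest clusters (by maximum pairwise distance) until n_clusters remain."""
    dist = correlation_distance(profiles)
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        # Complete linkage: distance between the two most dissimilar members.
        return max(dist[(a, b)] for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical %RP profiles over four samples: A and B rise, C and D fall.
example_profiles = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.5],
    "C": [4.0, 3.0, 2.0, 1.0],
    "D": [8.0, 6.0, 4.0, 1.5],
}
example_clusters = complete_linkage(example_profiles, n_clusters=2)
```

With this distance, perfectly correlated profiles are at distance 0 and perfectly anticorrelated profiles at distance 1, so taxa are grouped by the shape of their temporal pattern rather than by the magnitude of %RP.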

Indicator analysis 

Enrichment of a KEGG KO ortholog for a given RP cluster 
was determined using the indicator species analysis of Dufrêne 
and Legendre (1997; also see Gifford et al., 2013). For individ- 
ual KEGG KOs, an indicator value (IV) was calculated for each 
activity cluster based on both fidelity (the proportion of cluster 
member samples in which the KO was expressed) and speci- 
ficity (the proportion that an RP cluster contributed to the total 
summed KO expression across all clusters). The IV was calcu- 
lated as follows: A hit was defined as a gene binning to a given 
KO being expressed in one of the 20 samples [i.e., expression 
detected (1) or not detected (0)]. The number of hits per taxon 
for a given KO exceeded 20 in some cases, as some taxa had 
multiple genes binning to the same KO. The hit count was then 
normalized by summing all the hits to a KO within an activ- 
ity cluster and dividing by the number of taxa in that cluster. 
Specificity was calculated for each activity cluster as the normal- 
ized hit count divided by the sum of all the RP cluster normalized 
hit counts. Fidelity was the total number of samples that had 
the given KO expressed in an activity cluster divided by the total 
samples (number of taxa times 20 samples). The indicator value 
was calculated as the product of fidelity and specificity multiplied 
by 100. 
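
The indicator value calculation for a single KO can be sketched in Python as below. This is one reading of the description above, not the authors' code: the fidelity numerator is assumed to count, for each taxon, the samples in which at least one gene of the KO was detected. The nested-dictionary input format, the function name, and the example data are hypothetical.

```python
def indicator_values(expression, n_samples=20):
    """Indicator value (IV) of one KO for each activity cluster.

    expression: {cluster: {taxon: [gene_vector, ...]}}, where each gene_vector
    is a list of 0/1 detection flags, one per sample (assumed format). A taxon
    without a gene for this KO has an empty list.
    """
    normalized_hits = {}
    fidelity = {}
    for cluster, taxa in expression.items():
        n_taxa = len(taxa)
        # Hits: every (gene, sample) detection; may exceed n_samples per taxon.
        hits = sum(sum(gene) for genes in taxa.values() for gene in genes)
        normalized_hits[cluster] = hits / n_taxa
        # Fidelity: taxon-sample combinations with the KO expressed at all.
        expressed = sum(
            sum(1 for s in range(n_samples) if any(gene[s] for gene in genes))
            for genes in taxa.values()
        )
        fidelity[cluster] = expressed / (n_taxa * n_samples)

    total = sum(normalized_hits.values())
    return {
        cluster: (100.0 * fidelity[cluster] * normalized_hits[cluster] / total
                  if total else 0.0)
        for cluster in expression
    }


# Hypothetical KO across two clusters and four samples.
example_expression = {
    "cluster1": {"t1": [[1, 1, 1, 1]], "t2": [[1, 1, 0, 0]]},
    "cluster2": {"t3": [[1, 0, 0, 0]], "t4": []},
}
example_iv = indicator_values(example_expression, n_samples=4)
```

In this example cluster1 has a fidelity of 6/8 and a specificity of 3.0/3.5, giving an IV of about 64, while cluster2 has a fidelity of 1/8 and a specificity of 0.5/3.5, giving an IV of about 1.8.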

Statistically significant differences in indicator values between 
activity clusters were determined by random permutations. Taxa 
were randomly assigned to an activity cluster (keeping the num- 
ber of taxa in a cluster the same) and the IV of the random clusters 
was determined. The process was repeated for 1000 iterations and 
the p-values were calculated as the proportion of random IVs 
greater than observed IVs. 

To identify diel differences in Cluster 2 expression, the indi- 
cator analysis was conducted as above except only Cluster 2 
KOs were considered and the orthologs were grouped into two 
sets, those from night samples and those from day samples. 
Significantly different indicator values between the day and night 
groups were determined through the random permutation test 
described above (p < 0.05, 10,000 iterations). 



RESULTS 

Samples were collected off the coast of Georgia, USA as part of 
the Sapelo Island Microbial Observatory quarterly sampling pro- 
gram (http://simo.marsci.uga.edu). Samples for RNA, DNA, and 
other environmental parameters were collected in triplicate over 
four high tide cycles within 48 h, resulting in samples from two 
consecutive days and nights (Table S1). Twenty RNA samples 
encompassing a period from August 2008 to August 2009 were 
sequenced, representative of the summer, fall, winter, spring, and 
second summer seasons. Six libraries were previously reported 
(Table S2): two using 454 pyrosequencing (Gifford et al., 2011) 
and four using the Illumina GAIIx platform with 100 bp single 
reads (Gifford et al., 2013). The remaining 14 samples were 
sequenced using the Illumina GAIIx platform with 150 × 150 bp 
paired ends (Table S2; overlapping paired reads were assembled), 
with two of the samples sequenced in duplicate (i.e., technical 
replicates). The combined 253 million reads from all 22 libraries 
were compared to a custom database of SILVA rRNAs (Gifford 
et al., 2013) using BLASTn to identify and remove any resid- 
ual rRNAs. The remaining 52 million potential protein encoding 
reads were compared to NCBI's RefSeq database (version 47) 
using BLASTx. The 29 million reads that had a significant hit 
(bit score > 40) fell into 5600 reference genome bins, with the 
top 200 bins accounting for two-thirds of all annotated reads. 
Over 1.4 million reads were annotated as RPs, comprising 5% of 
all RefSeq hits. The contribution of RP transcripts to a reference 
genome's transcriptome (%RP) was used as a proxy for relative 
activity. 

RELATIVE ACTIVITY PATTERNS 

An examination of the top 200 transcript-recruiting bins (includ- 
ing all 3 domains of life) revealed high diversity in potential 
activity levels across community members (Figure 1), though 
often both the magnitude and variance in %RP were con- 
served among closely related taxa and tended to diverge with 
increased phylogenetic distance. Variance in activity across the 
seasons (summer1, winter, fall, spring, and summer2) increased 
with the magnitude of %RP, with a significant linear relationship 
between mean %RP and standard deviation (p < 
0.0001, R² = 0.46). Thus the slow-growing taxa tended to maintain 
their low activity indexes both within and across seasons 
(Figure 1); these included all SAR11 genomes and several other 
alphaproteobacterial groups (Rhodospirillales, Rhizobiales, and 
non-Roseobacter Rhodobacterales), as well as the two archaeal 
reference bins. In contrast, high %RP dynamics were particularly 
noticeable for Roseobacters, with mid to upper activity 
indexes yet high temporal variability. Bacteroidetes fell into 
two activity groups, with slightly below average indexes for the 
Cytophagales/Sphingobacteria, and distinctly higher indexes for 
Flavobacteriales; both groups had relatively low across-season 
variability. The taxon bins with the highest activity indexes 
(>10 %RP) were most often Gammaproteobacteria bins (particularly 
members of the Alteromonadales and Oceanospirillaceae), 
but also included populations binning to the verrucomicrobium 
Coraliomargarita akajimensis, the roseobacters Citreicella 
and Ketogulonicigenium, and the alphaproteobacterium Paracoccus 
denitrificans. 

[Figure 1 graphic: %RP for each genome bin in Summer 1, Fall, Winter, Spring, and Summer 2, with bins grouped into panels (Verrucomicrobia/Planctomycetes, Betaproteobacteria, SAR11, Roseobacter, misc. Alphaproteobacteria, Archaea, Gammaproteobacteria, Deltaproteobacteria, Cyanobacteria, Eukaryotes). Each bin is annotated with its activity cluster and with markers for significant day/night and seasonal differences.]

FIGURE 1 | Seasonal variation in %RP values for the top 200 
transcript-recruiting genome bins. Error bars indicate the 95% confidence 
intervals determined by bootstrapping (1000 iterations). The activity cluster of 
each bin is given (see Figure 2 and Figure S1), and bins with significant 
variation in %RP either between seasons or between day and night samples 
are indicated. 

These patterns in %RP are suggestive of distinctive life- 
histories among coastal community members. Two caveats for 
interpreting ribosome-related signals, however, are that a sig- 
nificant portion of protein synthesis activity may be related to 
non-growth functions (Blazewicz et al., 2013), and relationships 
between ribosome content and activity may be taxon-specific 
(Lin et al., 2013). We therefore confined our exploration of 
%RP patterns to examinations of within-taxon temporal patterns, 
focusing on the 200 highest-recruiting reference genome bins and 
using hierarchical clustering to group taxa with similar temporal 
activity patterns across the 20 time points. A bifurcating dendro- 
gram that further resolved into seven deeply branching clusters 
(hereafter referred to as activity clusters) with distinct temporal 
patterns in activity and taxonomic membership emerged from 
this analysis (Figure 2A; Figure S1). 

LINKING ACTIVITY, FUNCTION, AND ENVIRONMENTAL DYNAMICS 

A CCA relating potential activity to temporal gradients in envi- 
ronmental conditions produced similar groupings for the taxa as 
the hierarchical clustering method and indicated that members 
of the same activity cluster had similar activity optima along the 
measured environmental gradients (Figure 2B). Temperature and 
PAR explained the most variation in taxon activity. The CCA plot 
also indicated that the major bifurcation evident in the cluster 
dendrogram (Figure 2A) was related to factors that correlate with 
temperature, as taxa in Clusters 1-3 fell below the temperature 
centroid and taxa in Clusters 4-7 fell above it (Figure 2B). 

We examined expression of functional genes in the 200 refer- 
ence genomes to determine if patterns in function further united 
the activity clusters. Functional characterization was approached 
in three ways. (1) Indicator gene expression: The orthologous 
gene relationships among 168 of the bacterial genome bins were 
defined based on their KEGG KO assignments in the integrated 
microbial genomes (IMG) database (ver 3.5) and used in an 
indicator species analysis (see Methods) to identify expressed 
genes characteristic of each cluster (using a random permuta- 
tion test with 1000 iterations to determine significance; Table 
S3). These 168 genomes represent 90% of the 188 bacterial 
members within the top 200 transcript recruiting taxa. The 
20 genomes not included in this analysis were not available 
in IMG. (2) KEGG pathway enrichment: Significant indicator 
genes identified in approach #1 were assigned to KEGG path- 
ways based on their KO assignment, and pathways with sig- 
nificantly higher representation in a cluster were identified by 
permutation tests (Table S4). (3) Highly expressed RefSeq 
genes: The most highly expressed RefSeq-annotated genes within 
individual genome bins were investigated for seasonal and diel 
dynamics. 

Activity Cluster 1 

Cluster 1 members generally had highest potential activity in 
the winter and spring samples (Figures 2A,B), with 72% of 
the 47 members having a significant seasonal activity pattern. 
Consistent with this cold-water bias, several psychrotolerant ref- 
erence genome bins were in this group (Glaciecola spp., Colwellia 
psychrerythraea, Polaribacter sp., and Octadecabacter antarcticus; 
Figure 1). Diel differences in activity were also apparent, with a 
third of members having significantly higher %RP in the day than 
night (Figure 3 and Table S5). The cluster was dominated taxonomically 
by Gammaproteobacteria (related to Alteromonadales, 
Oceanospirillales, OMG, and NOR5 groups) with additional 
members from the Alphaproteobacteria (Roseobacter, SAR116) 
and Flavobacteria (Figure 2C). 

The indicator analysis revealed that Cluster 1 was signifi- 
cantly enriched in gene expression for flagellar biosynthesis, type 
IV pilus assembly, chemotaxis, secretion systems, and sodium 
driven transport. Biopolymer transport was also characteris- 
tic, with 90% of cluster members expressing TonB dependent 
transporters. Highly expressed RefSeq genes within these bins 
included cadherins, extracellular binding proteins, and glycosyl 
hydrolases, suggesting involvement by members of this cluster 
in attachment to and degradation of complex carbohydrates. 
Known polysaccharide degraders among the reference genome 
bins included Teredinibacter turnerae, Saccharophagus degradans, 
and Zunongwangia profunda. Cluster 1 was significantly enriched 
in taxa with PR genes (p < 0.05, permutation test, 10,000 iter- 
ations), having 13 of the 27 PR-harboring taxa in the top 200 
transcript-recruiting bins. PR was the first or second most highly 
expressed gene for the vast majority of these taxa. 

Activity Cluster 2 

Cluster 2 was distinguished by strong day-night differences in 
activity levels, with almost all high %RP samples collected in 
the day (Figures 2A, 3) and in association with high PAR levels 
(Figure 2B). The cluster is dominated by Roseobacters, with 33 
of the 37 Roseobacters in the top 200 taxa assigned to Cluster 2 
(Figure 2C). The large differences in day-night activity for this 
cluster are consistent with the high Roseobacter %RP variability 
seen in Figure 1, as both the highest (daytime) and the lowest 
(nighttime) activity occur in the fall for these taxa (Figure 3), 
with substantial day-night divergences also seen in other seasons. 

KEGG pathways significantly enriched in Cluster 2 indicator 
genes (Table S4) included several amino acid (AA) metabolism 
pathways (144 AA significant indicator genes; three times higher 
for this cluster than any other; Table S4 and Figure S3). Cluster 
2 contained more than half of all transporter indicator genes 
(Table S4), many of which were ABC transporters for amino 
acids. Indicator genes for aromatic compound degradation were 
also characteristic of Cluster 2, including those for aromatic 
amino acids as well as for a broader array of aromatic substrates 
such as benzoate (Table S4 and Figure S3). Most striking was 
a set of 14 indicator genes making up a complete degradation 
pathway that started with the aromatic compounds salicylate, 
anthranilate, and vanillate and led into the TCA cycle (Figure 
S3B). A second aromatic pathway for the degradation of pheny- 
lacetic acid was also present, composed of 10 indicator genes 
(Figure S3B). 

Cluster 2 contained half of all the aerobic anoxygenic photosynthetic
(AAnP) genomes in the top 200 reference genomes, and
three AAnP-related genes (pufM, pufL, and a light harvesting protein)
were indicators. Other Cluster 2 indicator genes included
formate dehydrogenase, DMSO reductase, mercuric reductase,
phosphonate metabolism (phnGHIJM), carbon monoxide
dehydrogenase, and urease (Table S3). Finally, carboxylic acid
metabolism was characteristic of the cluster, which was significantly
enriched in KEGG pathways for glyoxylate/dicarboxylate,
propanoate, and butanoate metabolism (Table S4).

FIGURE 2 | Patterns in %RP expression among genome bins and in
relation to environmental variables. (A) The top 200 transcript-recruiting
taxa were hierarchically clustered based on pairwise Pearson correlations of
%RP. To the right of the dendrogram, the 20 samples are arranged in rank
order from lowest to highest %RP and colored by the sample's seasonal and
day-night origins. See Figure S1 for the same dendrogram with taxon labels
included. (B) Canonical Correspondence Analysis (CCA) of the 200 taxa
ordinated by %RP and environmental variables. The taxa are colored
according to their activity cluster as shown in part (A). (C) Taxonomic
composition of cluster members.
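The activity clusters rest on two quantities named in the Figure 2 caption: %RP per taxon per sample, and pairwise Pearson correlations of those %RP profiles. The sketch below illustrates both under stated assumptions. The taxon names and read counts are invented for illustration, the conversion of correlation to a distance as 1 - r is a common convention rather than something the text specifies, and the linkage step of the hierarchical clustering is omitted because the method is not given here.

```python
from math import sqrt


def percent_rp(rp_reads, total_reads):
    """Percent of a taxon's transcripts in one sample that encode ribosomal proteins."""
    return 100.0 * rp_reads / total_reads


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical (RP reads, total reads) per sample for three made-up taxa.
counts = {
    "taxon_A": [(50, 1000), (80, 1000), (20, 1000), (60, 1000)],
    "taxon_B": [(10, 500), (17, 500), (5, 500), (12, 500)],
    "taxon_C": [(90, 1000), (30, 1000), (100, 1000), (40, 1000)],
}

# %RP profile of each taxon across the samples.
profiles = {
    taxon: [percent_rp(rp, total) for rp, total in samples]
    for taxon, samples in counts.items()
}

# Assumed distance: 1 - r, so taxa with similar activity timing sit close together.
distance = {
    (a, b): 1.0 - pearson(profiles[a], profiles[b])
    for a in profiles
    for b in profiles
    if a < b
}

for pair, d in sorted(distance.items(), key=lambda item: item[1]):
    print(pair, round(d, 3))
```

In this toy input, taxon_A and taxon_B rise and fall together and end up close, while taxon_C peaks in the opposite samples and ends up far from both. Because %RP is a within-taxon percentage, the two taxa group together even though their read totals differ.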

Activity Cluster 3 

Cluster 3 did not exhibit the strong day-night differences in activ- 
ity observed for Clusters 1 and 2, but did show strong seasonal 
differences (Figure 2A) in which potential activity was inversely
correlated with water temperature (increasing from summer to
winter; Figure 2B). Taxonomically, Cluster 3 was primarily composed
of relatives of Verrucomicrobia, Planctomycetes, and
Gammaproteobacteria. Functionally, Cluster 3 expression was 
enriched for indicator genes in KEGG pathways for sugar 
metabolism (fructose/mannose; pentose/glucuronate intercon- 
versions; Table S4). Overrepresentation of glycan metabolism 
genes and sulfatase genes may indicate roles in the breakdown 
of polysaccharides (Teeling et al., 2012). Like Cluster 1, Cluster
3 also had indicator genes for motility, chemotaxis, and secretion 
systems.

FIGURE 3 | Daytime enrichment in the transcriptome devoted to
ribosomal protein synthesis. Example bins are from Activity Cluster 1
(gamma proteobacterium HTCC2080), Cluster 2 (Roseobacter Roseovarius
sp. TM1035), and Cluster 4 (betaproteobacterium Methylophilales
HTCC2181). Black bars = night samples, gold bars = day samples. For the
%RP graphs of all top 200 transcript-recruiting taxa see Figure S2.



Activity Cluster 4 

Cluster 4 was characterized by higher potential activity in the 
spring and second summer, corresponding with warmer waters 
(Figure 2B). Like Cluster 1, 80% of members had significant 
seasonal activity patterns, although this cluster was warm-water 
biased rather than cold-water biased. Day-night patterns in activ- 
ity were not consistent across the cluster, with only one-third 
of members having significant %RP day-night differences (Table 
S5). Cluster 4 was the largest cluster, containing 54 taxa repre- 
senting diverse lineages (Figure 2C). Possibly due to this diversity, 
there were few significant indicator genes that united the cluster, 
and no KEGG pathways were significantly enriched. The cluster 
was significantly enriched in PR-containing taxa (p < 0.05,
permutation test, 10,000 iterations), harboring 12 of the 27 PR 
taxa in the top 200 taxa. 
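The enrichment test reported above can be illustrated with a label-shuffling permutation test. This is a minimal sketch, not the authors' implementation: the counts (200 taxa, 27 PR-harboring taxa, 54 Cluster 4 members, 12 observed) come from the text, but the null model (random reassignment of PR labels among the 200 taxa), the one-sided comparison, and the function name `permutation_enrichment_p` are assumptions.

```python
import random


def permutation_enrichment_p(n_total, n_marked, cluster_size, observed,
                             n_iter=10_000, seed=0):
    """One-sided permutation p-value for enrichment of marked taxa in a cluster.

    Shuffles the marked/unmarked labels among all taxa and counts how often a
    cluster of the same size holds at least `observed` marked taxa by chance.
    """
    rng = random.Random(seed)
    labels = [1] * n_marked + [0] * (n_total - n_marked)
    at_least_as_extreme = 0
    for _ in range(n_iter):
        rng.shuffle(labels)
        # After shuffling, the first `cluster_size` positions act as a random cluster.
        if sum(labels[:cluster_size]) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (at_least_as_extreme + 1) / (n_iter + 1)


# Counts from the text: 12 of the 27 PR taxa fall in the 54-member Cluster 4.
p_value = permutation_enrichment_p(n_total=200, n_marked=27,
                                   cluster_size=54, observed=12)
print(f"one-sided permutation p = {p_value:.4f}")
```

Under this null, a 54-member cluster would hold about 7 of the 27 PR taxa on average (27 × 54 / 200 ≈ 7.3), so 12 is well above the chance expectation.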

Within Cluster 4, however, several subclusters had distinct taxonomic
and functional gene expression characteristics (Figure
S1), one of which contained four of the five SAR11 genomes. The
populations recruiting to the SAR11 bins had significantly higher
%RP in the summer and spring samples compared to other seasons.
Several of the most highly expressed genes within SAR11
bins also had seasonal dynamics (sodium symporter, V-type
pyrophosphatases, elongation factors; Figure S4A). The SAR11
PRs showed little temporal variation, fitting with previous observations
that this gene is often constitutively expressed by SAR11
members (Figure S4A) (Steindler et al., 2011; Vila-Costa et al.,
2013).

Adjacent to the SAR11s was another subcluster containing
proteobacteria with a distinct methylotrophy signal (Figure S1).
Gene expression of subcluster members Betaproteobacteria KB13
and HTCC2181 was dominated by methanol dehydrogenase (up
to 50% of hits; Figure S4B). Although these taxa showed some
of the highest day-night variation in potential activity over the
entire time series (Figure 3), there was no overall significant
day-night difference in expression of methanol dehydrogenase (t-test,
p > 0.1); however, winter expression of this gene was considerably
higher at night (Figure S4). The methylotrophy subcluster
also included the alphaproteobacterium Bradyrhizobium sp.
reference genome bin, with a putative ethanol dehydrogenase
and methanol dehydrogenase among its most highly expressed
genes, as well as populations binning to the gammaproteobacterium
Methylophaga thiooxydans, also with a methanol/ethanol
dehydrogenase as a highly expressed gene.

FIGURE 4 | Number of taxa with maximum %RP occurring in each
season. Bars are colored based on activity cluster assignments.

Activity Cluster 5 

Strong seasonal variation characterized Cluster 5 members' activ- 
ity (low in winter, high in both summers; Figure 2A). The 
cluster was dominated by phytoplankton taxa, including the 
picoeukaryote reference genomes Ostreococcus, Chlamydomonas, 
and Chlorella and all six Synechococcus genomes (Figure 2C). 
The Synechococcus genomes grouped tightly and showed sig- 
nificant positive correlations between %RP and temperature 
(Pearson's correlation, p < 0.05) (Figure S5). Cluster 5 had many 
photosynthesis-related indicator genes and significant enrich- 
ment of KEGG pathways for glucan, retinol, porphyrin, chloro- 
phyll, and photosystem biosynthesis (Table S4). Interestingly, 
only one of the cyanobacteria reference bins in this cluster
(Synechococcus sp. RS9916) had significant day-night differences
in %RP, fitting previous observations that Synechococcus diel
periodicity of growth- and photosystem-related transcription can
be muted compared to other phytoplankton taxa (Ottesen et al.,
2013). In addition to the photoautotrophs, this cluster included
two heterotrophic reference genome bins, Planctomyces brasiliensis
and the verrucomicrobium Pedosphaera parvula.

Activity Cluster 6 

Only five taxa grouped into Cluster 6, including
Gammaproteobacteria, Bacteroidetes, and Verrucomicrobia.
This cluster had few defining characteristics except for the general
absence of significant day-night or seasonal differences in activity.
There were few indicator genes and no significantly enriched
pathways.

Activity Cluster 7 

Cluster 7 members were diverse in the seasonal timing of their 
peak activity, but were distinguished by significantly higher 
activity in the night vs. the day (Figure 2A and Table S5). 
The CCA showed Cluster 7 members widely spread
along the temperature gradient, but all plotting below the
median PAR (Figure 2B). The cluster was taxonomically diverse,
including eukaryotic phytoplankton, Archaea, and several heterotrophic
bacteria. The eukaryotes' functional gene expression
was dominated by photosynthesis machinery, with photosystem II
transcripts accounting for two-thirds of these bins and
significantly enriched in the day (t-test, p < 0.05), a pattern similarly
observed in metatranscriptomic studies of coastal Pacific
phytoplankton (Ottesen et al., 2013). The two archaeal reference
genomes had relatively stable gene expression across
seasons, despite the fact that their populations bloom in the
late summer at this site (Hollibaugh et al., 2011, 2014). The
heterotrophic bacterioplankton were enriched in sulfur-oxidizing
Gammaproteobacteria reference genome bins (Beggiatoa
sp., Ruthia magnifica, Vesicomyosocius okutanii, Halothiobacillus
neapolitanus) with high expression of sulfur oxidation genes,
adenylylsulfate reductase, rhodanese, and cytochromes.

DISCUSSION 

SEASONAL SUCCESSION OF ACTIVITY 

Temperature shifts reflect the broader seasonal changes in envi- 
ronmental conditions at this site, correlating with the deepest 
divergence in microbial activity patterns that split taxa into one 
group with higher potential activity in cold weather (Clusters 
1-3) and one group with higher activity in warm weather 
(Clusters 4-7) (Figure 2). In the spring, as water temperatures 
warmed from winter lows, the system entered its period of max- 
imum primary production, coinciding with increased activity of 
phytoplankton in Clusters 5 and 7 (Figure 2). High primary pro- 
duction likely resulted in increased concentrations of labile DOM, 
driving the greater numbers and phylogenetic diversity of bac- 
terial taxa with maximum potential activity during the spring 
(Figure 4). 

Water temperatures reach their peak in summer, a period
characterized by high inorganic nutrient availability, respiration
rates, and bacterial abundance (Figure 2, Table S1; Hollibaugh
et al., 2014). Both primary and bacterial production are high during
this period, suggesting rapid cycling of matter and energy
through the microbial loop. Taxa peaking in activity during
summers were found almost exclusively in Clusters 4, 5, and 7
(Figure 4), including SAR11s, methylotrophs, and Archaea with
gene expression emphasizing the metabolism of low molecular
weight organic compounds (methanol, acetate) and transport of
inorganic nutrients (ammonia). The gene expression patterns are
consistent with a streamlined lifestyle (Giovannoni et al., 2005),
potentially allowing these taxa to more efficiently compete for
labile organic matter being released directly from primary producers
or resulting from microbial recycling. A comparison of the
two summer seasons revealed that summer 2 had higher bacterial
production rates and cell concentrations and lower nutrient
levels (Table S1), suggesting that the higher percentage of taxa with
greater %RP than in summer 1 reflected a response to a more
productive environment.

Environmental conditions during the fall season reflected 
a transition period, with decreases from summer highs in 
water temperature, primary production, nutrient availability, and 
microbial production. We observed many taxa that had both 
their highest and lowest %RP occurring in one of the fall sam- 
ples (Figures 2A, 4). Cluster 2, in particular, was dominated by 
taxa with high fall activity, with characteristic gene expression for 
metabolism of amino acids and aromatic compounds (salicylate, 
vanillate, phenylacetic acid, and anthranilate), reflecting utiliza- 
tion of both labile and refractory substrates that might be linked 
to the initiation of vascular plant senescence in adjacent marshes.

The system was at its most heterotrophic during the winter,
with low water temperatures, nutrient availability, and primary
production (Figure 2B, Table S1). Surprisingly, many bacteria
had their maximum activity index in the winter samples, particularly
those in Clusters 1-3, potentially driven by an ability
to process more refractory substrates or complex carbohydrates
during a period when labile compounds released by primary producers
may be in short supply. Expression of TonB-dependent
transporters, cadherins, extracellular binding proteins, and glycosyl
hydrolases characterized these winter-active taxa. Motility
and chemotaxis were also characteristic of these groups, suggesting
reliance on particles or transient patches of enriched organic
matter.

DIEL PATTERNS OF ACTIVITY 

Eighty-seven genome bins had significantly different activity
indexes in the day than night, many of which were found in
Clusters 1, 2, 4, and 7 (35, 92, 30, and 41% of members, respectively).
Indications of diel forcing of microbial transcription have
been found previously in the Western English Channel (Gilbert
et al., 2010), North Subtropical Pacific (Poretsky et al., 2009),
and Monterey Bay (Ottesen et al., 2013). Here, we found that
in addition to the diel dynamics of photoautotrophs, a significant
fraction of the heterotrophic community also exhibited
day-night activity dynamics, with day-time %RP enrichment for
78 heterotrophic bacterial genome bins (Table S5 and Figure 3).

To characterize the transcriptional activity driving day-night 
differences in Cluster 2, which had the highest percentage of sig- 
nificant day-time %RP enriched taxa (92%), an indicator analysis 
was performed to identify gene expression characteristic of each 
time period (paired permutation test; 1000 iterations; Table S6).

FIGURE 5 | Cluster 2 day and night indicator genes. (A) Orthologs (KEGG
KOs) that were indicators for day or night samples. "Genomes" indicates the
number of Cluster 2 genomes expressing orthologs in the given category
(out of 45 total genomes in the analysis). "KOs" indicates the number of
unique KO classifications in that category that were expressed by Cluster 2
members. "Enriched day" or "Enriched night" indicates the number of those
KOs significantly enriched in either the day or night samples. (B) Examples of
individual KOs in the categories shown in (A).

(A)
Category                      Genomes  KOs  Enriched day  Enriched night
amino acid metabolism              45  364            27              61
ABC transporters                   45  170             9              37
mono- or dioxygenases              45   45             0              10
TCA cycle                          45   35             3              14
porphyrin and chlorophyll me       12   43             1              14
AAnP photosystem                   12   11             0               9
terpenoids                         45   21             1               8
carotenoid biosynthesis            30    8             1               6
formate dehydrogenase              40    5             0               4
motility                           36   37             0               3
chemotaxis                         43   14             0               2
DNA replication                    45   26             1               4
Cell division                      45   13             1               2
anti-, symporters                  43   31             3               3
PHAs                               37    1             1               0
heavy metal tolerance              36    1             1               0
metal chelators                    45    7             3               1
ROS damage                         45    7             3               0
elongation factors                 45    8             4               0
protein export                     45    9             4               0
DNA repair                         45    9             5               0
Cobalt                             39   10             3               0
One carbon pool by folate          45   18             4               1
protease                           45   28             9               1
oxidative phosphorylation          45   54            14               1
ribosomal proteins                 45   56            33               0



The nighttime indicator genes for Cluster 2 were highly enriched
for AAnP-related genes: photosystems, light harvesting proteins,
bacteriochlorophyll metabolism, and carotenoid biosynthesis
(Figure 5). This was the strongest diel signal observed, with an
87-fold average (1232-fold maximum) relative increase in night
samples compared to day across 29 AAnP indicator genes (Table
S6). While the nighttime enrichment of these phototrophy-related
transcripts seems counterintuitive, it is in line with
previous laboratory studies of AAnP-capable Roseobacters (Yurkov
and Beatty, 1998). Nighttime indicator genes also included those
for central metabolism, ABC transporters, mono- and dioxygenases,
and formate dehydrogenase (Figure 5). The day-time
transcriptomes for Cluster 2 were instead characterized by genes
for growth, repair, and energy generation (Figure 5). Indicator
genes included nearly two-thirds of the 56 RPs, along with
related protein synthesis machinery (RNA polymerases, tRNA
synthetases, elongation factors, chaperones, and proteases), genes
for energy generation via oxidative phosphorylation (F-type
H+-transporting ATPases, cytochrome c enzymes, and NAD(P)H
cycling), and genes for DNA repair (exonucleases, DNA ligases,
and DNA photolyases) and antioxidant synthesis (glutathione
peroxidase and catalase/peroxidase). Cluster 2 also included a
number of metal transporter indicator genes, which in the day
were biased toward cobalt (via cobaltochelatases) and iron (via
ferrochelatase), potentially linked to vitamin B12 biosynthesis
and cytochrome activity, while at night were biased toward magnesium
(via magnesium chelatase), potentially tied to the role of
magnesium as a coordinating ion for bacteriochlorophyll.

FIGURE 6 | Seasonal and day-night variability in phytoplankton
concentrations at Sapelo Island, Georgia, USA. (A) Three-year
time series of chlorophyll a (chl a) concentrations. Gold and black bars
indicate day and night samples, respectively. Su, summer; Fa, fall; Wi,
winter; Sp, spring. (B) Higher temporal resolution chlorophyll a
measurements (black line) and phytoplankton concentrations during a
four-day period in Summer 2010. Green line, centric diatoms; red line,
pennate diatoms; blue line, dinoflagellates. Night hours are shaded in gray.

In contrast to the many day-time %RP enriched taxa, only
nine taxa had significantly higher nighttime activity indexes, and
these all belonged to Cluster 7, a group with high phytoplankton
membership. This counterintuitive diel pattern coincided with
substantial enrichment of photosynthesis-related transcripts during
the day (up to 12-fold). For both this cluster and Cluster 2,
we considered the possibility that strong upregulation of phototrophy
gene expression would decrease the relative contribution
of RP genes to the transcript pool, resulting in %RP changes
that were due to shifts in non-RP transcripts rather than changes
in RP transcripts. Internal standard mRNAs added to the samples
just prior to extraction (see Methods) allowed us to test
this by calculating RP transcripts L⁻¹ for each taxon and reanalyzing
the day-night activity patterns. Of the seven eukaryotic
phytoplankton in Cluster 7 with significantly higher nighttime
%RP, five were no longer significant when absolute transcript
counts were used (Table S5 and Figure S2); only populations binning
to Emiliania huxleyi and Parachlorella kessleri still had higher
nighttime expression on an absolute scale. When this same analysis
was carried out on Cluster 2, however, 40 of the 43 taxa with
significantly higher daytime %RP were still significant when RP
transcripts L⁻¹ were used instead (Table S5 and Figure S2). In all
the clusters combined, 62 of the 78 heterotrophic bacterial taxa
with significant %RP day-night differences were also significant
with RP transcripts L⁻¹ data. This included the majority of photoheterotrophs
exhibiting high expression of light capture transcripts,
such as most of the AAnP-capable taxa in Clusters 1 and
2 and the proteorhodopsin-capable taxa with enriched day-time
%RP. Day-night shifts in activity indexes were therefore artifacts
of non-RP gene expression (particularly the strongly upregulated
photosynthetic machinery) for most of the Cluster 7 eukaryotes, but not
for most bacteria, potentially due to smaller dynamics in cellular
mRNA inventories in prokaryotic cells (Moran et al., 2013).
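The distinction between a relative index (%RP) and an absolute one (RP transcripts per liter) can be made concrete with a small worked example. This is a sketch under stated assumptions, not the authors' pipeline: it assumes the usual internal-standard scaling, in which the ratio of standard molecules added to standard reads recovered converts read counts to molecule counts, and all read counts, volumes, and function names below are hypothetical.

```python
def percent_rp(rp_reads, total_reads):
    """Relative index: percent of a taxon's reads that are RP transcripts."""
    return 100.0 * rp_reads / total_reads


def transcripts_per_liter(reads, standard_reads, standard_molecules_added,
                          liters_filtered):
    """Absolute index: scale reads by the recovery of a known internal standard."""
    # Each recovered standard read stands for this many molecules in the sample.
    molecules_per_read = standard_molecules_added / standard_reads
    return reads * molecules_per_read / liters_filtered


# Hypothetical taxon: RP reads are constant, but non-RP reads (for example
# photosynthesis transcripts) are much higher in the day sample.
night = {"rp_reads": 100, "other_reads": 900}
day = {"rp_reads": 100, "other_reads": 2900}

# Hypothetical standard recovery and filtered volume, identical in both samples.
standard = {"standard_reads": 1000,
            "standard_molecules_added": 1_000_000_000,
            "liters_filtered": 1.0}

night_pct = percent_rp(night["rp_reads"], night["rp_reads"] + night["other_reads"])
day_pct = percent_rp(day["rp_reads"], day["rp_reads"] + day["other_reads"])
night_abs = transcripts_per_liter(night["rp_reads"], **standard)
day_abs = transcripts_per_liter(day["rp_reads"], **standard)

print(f"%RP night = {night_pct:.2f}, day = {day_pct:.2f}")
print(f"RP transcripts per liter night = {night_abs:.3g}, day = {day_abs:.3g}")
```

In this constructed case %RP is higher at night only because the denominator grows during the day, while RP transcripts per liter are identical, which is the kind of artifact the reanalysis with internal standards was designed to detect.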

ECOLOGICAL DRIVERS OF DIEL ACTIVITY 

The most obvious ecological driver for the 44% of taxa with
diel transcriptional dynamics is solar radiation, for which both
direct and indirect effects could be important. Direct light effects
on transcriptional patterns are expected for some photoautotrophic
taxa (Ottesen et al., 2013), and we observed significant
day-night dynamics in photosynthesis-related transcription
for phytoplankton in Clusters 5 and 7. Direct effects on
light-driven processes were also evident as increased transcription
of phototrophy machinery in some AAnP heterotrophs. The
inhibition of bacteriochlorophyll synthesis by light observed in
early studies of AAnP cultures (Yurkov and Beatty, 1998) suggests
that our unexpected finding of nighttime AAnP transcription
enrichment was also likely regulated by changes in solar
radiation. Light regulation of phototrophy gene transcription,
however, cannot fully explain the extent of the diurnal heterotrophic
activity observed, as only a minority (7%) of the
200 taxa have AAnP genes, and other bacterial light-capturing
machinery did not exhibit strong diel dynamics (only 2 of 27
PR-capable taxa showed significant day-night differences in PR
expression).

A second possible ecological driver of day-night differ- 
ences in heterotrophic activity is indirect propagation of light 
effects through trophic interactions with phytoplankton. We have 
consistently observed strong diel periodicity in phytoplankton 
biomass at this site based on chlorophyll a (Figure 6A) and cell 
abundance (Figure 6B) measurements. The daytime increases 
in phytoplankton might enhance opportunities for ecological 
interactions with the bacterial community, and these could be 
mediated through leakage and uptake of dissolved organic mat- 
ter, surface attachment, or nutrient competition (Amin et al., 
2012). 

This "phytoplankton interaction" hypothesis may be particularly
relevant for Cluster 2, which showed the strongest and most
coherent day-night differences in activity, and which correlated
strongly with chlorophyll a concentrations and PAR (Figure 2B).
Further, this cluster was highly enriched in roseobacters, which
have been found in close association with phytoplankton and
linked to seasonal primary production patterns (Gilbert et al.,
2011; Amin et al., 2012; Morris et al., 2012). Cluster 2 indicator
genes for catabolism of amino acids and the photorespiration
product glycolate fit with the expected composition of phytoplankton
exudate (Carlucci et al., 1984; Lau et al., 2007). Our
observations of a significant nighttime enrichment of amino
acid transport and metabolism are in line with several other
metatranscriptomic studies (Poretsky et al., 2009; Ottesen et al.,
2013; Vila-Costa et al., 2013), and together with the nighttime
enrichment of AAnP-related transcription suggest that Cluster 2
activities during the dark may center on synthesis of the machinery
necessary to take advantage of carbon and energy sources in
the light.

Phytoplankton-heterotroph interactions were also suggested 
by several heterotrophic taxa whose activity patterns closely 
matched those of cyanobacteria and eukaryotic phytoplankton. 
In Cluster 5, a planctomycete and verrucomicrobium grouped 
with autotrophic Synechococcus and picoeukaryotes, and tran- 
scriptional data indicated that the two heterotrophs had high 
expression of capsular polysaccharides, twitching motility, and 
type IV secretion genes that together suggest an attachment 
lifestyle (Table S3). Members of the planctomycetes and verru- 
comicrobia have previously been shown to increase in abundance 
after phytoplankton blooms (Morris et al., 2006; Allen et al.,
2012) and were also found in Cluster 7 with other eukaryotic
phytoplankton.

CONCLUSIONS

Gene expression data assigned to hundreds of reference genome 
bins formed a picture of a complex, dynamic bacterial community 
with diverse relationships to environmental gradients. We 
approached the analysis of this complex system by using %RP 
(percent of the transcriptome devoted to RP synthesis) to cluster 
microbial groups with similar patterns in activity through time, 
and then looked within the activity clusters for commonalities in 
taxonomy, function, and relationships to environmental parame- 
ters. Temperature was a strong overall correlate with activity pat- 
terns, dividing the community into cold-biased and warm-biased 
superclusters. A significant portion of the heterotrophic bacte- 
rial community, however, had even stronger day-night activity 
dynamics, pointing to either direct solar radiation or products of 
photosynthesis as the most important activity driver. Day-night 
differences in gene expression revealed that many heterotrophic 
taxa are structuring their activities toward the synthesis of trans- 
port and metabolic genes at night, while focusing on growth, 
energy conservation, and repair during the day. Within this sea- 
sonal/diel framework, only 55 of the 200 taxa had no detectable 
temporal pattern in potential activity, suggestive of the wide 
diversity of microbial responses and ecological interactions within 
this coastal ecosystem.